In [1]:
input_height = 32
input_width = 32

filter_height = 8
filter_width = 8

P = 1  # zero padding
S = 2  # stride

new_height = (input_height - filter_height + 2 * P) / S + 1
new_width = (input_width - filter_width + 2 * P) / S + 1

print(new_height)
print(new_width)


14.0
14.0

In [9]:
# Alternate
import tensorflow as tf
input = tf.placeholder(tf.float32, (None, 32, 32, 3))
filter_weights = tf.Variable(tf.truncated_normal((8, 8, 3, 20))) # (height, width, input_depth, output_depth)
filter_bias = tf.Variable(tf.zeros(20))
strides = [1, 2, 2, 1] # (batch, height, width, depth)
padding = 'SAME'
conv = tf.nn.conv2d(input, filter_weights, strides, padding) + filter_bias

In [10]:
print(conv)


Tensor("add_4:0", shape=(?, 16, 16, 20), dtype=float32)

Note the output shape of conv will be [1, 16, 16, 20] (the batch dimension prints as ? above because the placeholder's batch size isn't fixed until it's fed). It's 4D to account for batch size, but more importantly, it's not [1, 14, 14, 20]. This is because the padding algorithm TensorFlow uses is not exactly the same as the one above. An alternative is to switch padding from 'SAME' to 'VALID', which would result in an output shape of [1, 13, 13, 20]. If you're curious how padding works in TensorFlow, read this document.
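To sanity-check the 'VALID' claim, the graph above can be rebuilt with only the padding argument changed. This is a minimal sketch using the same TF 1.x API as the cells above; conv_valid is just an illustrative name, and the exact Tensor name in the printed output will vary.

import tensorflow as tf

input = tf.placeholder(tf.float32, (None, 32, 32, 3))
filter_weights = tf.Variable(tf.truncated_normal((8, 8, 3, 20)))  # (height, width, input_depth, output_depth)
filter_bias = tf.Variable(tf.zeros(20))
strides = [1, 2, 2, 1]  # (batch, height, width, depth)
conv_valid = tf.nn.conv2d(input, filter_weights, strides, 'VALID') + filter_bias
print(conv_valid)  # shape=(?, 13, 13, 20)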

In summary, TensorFlow uses the following equations for 'SAME' vs. 'VALID' padding:

SAME Padding, the output height and width are computed as:

out_height = ceil(float(in_height) / float(strides[1]))

out_width = ceil(float(in_width) / float(strides[2]))

VALID Padding, the output height and width are computed as:

out_height = ceil(float(in_height - filter_height + 1) / float(strides[1]))

out_width = ceil(float(in_width - filter_width + 1) / float(strides[2]))
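As a quick check, the two formulas can be evaluated in plain Python for the example above (32x32 input, 8x8 filter, stride 2). This is only a sketch of the formulas; the variable names are illustrative.

import math

in_height, in_width = 32, 32
filter_height, filter_width = 8, 8
strides = [1, 2, 2, 1]  # (batch, height, width, depth)

# 'SAME' padding
out_height_same = math.ceil(float(in_height) / float(strides[1]))  # 16
out_width_same = math.ceil(float(in_width) / float(strides[2]))    # 16

# 'VALID' padding
out_height_valid = math.ceil(float(in_height - filter_height + 1) / float(strides[1]))  # 13
out_width_valid = math.ceil(float(in_width - filter_width + 1) / float(strides[2]))     # 13

print(out_height_same, out_width_same)    # 16 16
print(out_height_valid, out_width_valid)  # 13 13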

Calculating the Number of Parameters

We're now going to calculate the number of parameters of the convolutional layer. The answer from the last quiz will come into play here!

Being able to calculate the number of parameters in a neural network is useful since we want to have control over how much memory a neural network uses.

Setup

H = height, W = width, D = depth

We have
an input of shape 32x32x3 (HxWxD)
20 filters of shape 8x8x3 (HxWxD)
a stride of 2 for both the height and width (S)
zero padding of size 1 (P)

Output Layer

14x14x20 (HxWxD)

Hint - Without parameter sharing, each neuron in the output layer must connect to each neuron in the filter. In addition, each neuron in the output layer must also connect to a single bias neuron.

Solution

There are 756,560 total parameters. That's a HUGE amount! Here's how we calculate it:

(8 * 8 * 3 + 1) * (14 * 14 * 20) = 756560

8 * 8 * 3 is the number of weights per output neuron, and we add 1 for the bias. Remember, each of these weight sets is assigned to every single part of the output (14 * 14 * 20), so we multiply the two numbers together to get the final answer.
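The same arithmetic in a couple of lines of Python (just a sketch of the calculation above; the names are illustrative):

weights_per_output_neuron = 8 * 8 * 3 + 1   # filter weights plus one bias
output_neurons = 14 * 14 * 20               # every position in the 14x14x20 output volume
print(weights_per_output_neuron * output_neurons)  # 756560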

In [ ]: